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SURFACE RECONSTRUCTION AND REGISTRATION WITH A HELMHOLTZ 

RECIPROCAL IMAGE PAIR 

BACKGROUND OF THE INVENTION 

Image Reconstruction 

[0001] The problem of reconstructing or determining a shape from shading 
has received significant attention from the computer vision community. Determining 
monocular shape from shading using multiple images has culminated in numerous 
practical systems for consumer applications. This progress was made possible by 
developments in many associated fields such as determining shape from specularities, 
shape from interreflections, reflectance modeling and extraction. 

[0002] Recently there has been in the art successful uses of Helmholtz 
reciprocity in stereo reconstruction. This principle determines that the bi-directional 
reflectance distribution function (BRDF) of a surface is symmetric on the incoming 
and outgoing angles. The original statement by von Helmholtz referred only to 
specular reflections. A more general result was later stated by Lord Rayleigh. It is 
interesting to note that, for generic surfaces, the derivation of the Helmholtz 
reciprocity principle from basic physics has been established only recently. 

[0003] By swapping the positions of a camera and a light source, the principle 
of Helmholtz reciprocity can be exploited to recover the shape of surfaces with 
arbitrary BRDFs. Currently, multiple Helmholtz pairs are required to obtain 
satisfactory image reconstruction results. 

[0004] Thus, what is needed in the art is a reconstruction algorithm for 
continuous surfaces that makes use of Helmholtz reciprocity without resorting to 
multiple image pairs. 

Image Registration 

[0005] In many applications, a known 3D surface model needs to be aligned 
with a corresponding object's location in space. When the object is arbitrarily placed 
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before a set of cameras, the model can be registered with the imagery. In this way the 
pose of the object can be determined with respect to the cameras. One example of 
such a scenario is an industrial parts inspection system in which, a gauging system is 
used to align a CAD model of a known part with the imagery of a newly 
manufactured item. Image and model registration are also useful in the medical 
arena. For example, CT scans of patients are used to generate sets of iso-surfaces that 
are then aligned with images of the patients on an operating table. This is generally 
an initial step in image-guided surgery. 

[0006] There are a number of approaches that have been used to accomplish 
surface-to-imagery registration. If range images in the coordinate system of the 
cameras can be generated, algorithms such as Iterative Closest Point (ICP) can be 
used. Range images can be acquired using various strategies such as textured light 
reconstruction or dense stereo reconstruction. Textured light techniques such as laser 
striping may require the object to be stationary for extended periods of time and can 
be confounded by various types of surface finishes (such as polished metal and 
fiberglass composites). Most dense stereo reconstruction algorithms rely on surface 
texture to establish image-to-image correspondences, however, not all objects to be 
imaged include surface texture. 

[0007] An alternative approach is to view the 3D surface structure as a 
generative model for the resulting imagery. If an initial estimate of the object pose 
appears reasonable, gradient descent can be used to improve the pose estimate by 
minimizing the difference between the generated and observed images. By making 
simple assumptions regarding the source of illumination, the surface can be rendered 
into a given camera frame. In an earlier work, a model of surface topography and 
terrain reflectance was used to align satellite images. Others have shown that mutual 
information (MI) is a reasonable metric for measuring the difference between the 
generated and observed image. A stereo pair of images has been employed to 
generate the intensity values of one image based on the intensity values of another 
image. This is done by projecting each point on the surface into both images and 
transferring the intensity value found in the first image into the generated second 
image. Although this approach is more realistic than direct surface rendering, it is not 
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exact, in that it assumes a Lambertian model for surface reflectance. Therefore, what 
is needed in the art is a more exact generative model based on Helmholtz reciprocity 
for image registration. 

[0008] The above discussed and other features and advantages of the 
embodiments will be appreciated and understood by those skilled in the art from the 
following detailed description and drawings. 

BRIEF DESCRIPTION OF THE INVENTION 

[0009] Disclosed herein in an exemplary embodiment is a method of image 
reconstruction comprising: obtaining a Helmholtz reciprocal pair of images of an 
object, the Helmholtz reciprocal pair of images comprising a first image and a 
corresponding reciprocal image; determining an imaging geometry associated with 
the obtaining of the Helmholtz reciprocal pair of images; selecting a plurality of 
points in the first image and identifying corresponding candidate points in the 
corresponding reciprocal image; matching a selected point of the plurality of points 
and a candidate point of the corresponding candidate points. 

[0010] Also disclosed herein in another exemplary embodiment is a method of 
image registration with an object comprising: obtaining a Helmholtz reciprocal pair 
of images of an object, the Helmholtz reciprocal pair of images comprising a first 
image and a corresponding reciprocal image; estimating a pose for the object; 
predicting an estimated image corresponding to the pose and one image of the 
reciprocal pair of images; comparing the estimated image with a corresponding actual 
image from the pair of images; and refining the estimating a pose based on the 
comparing. 

[0011] Further disclosed herein in yet another exemplary embodiment is a 
storage medium encoded with a machine-readable computer program code, the code 
including instructions for causing a computer to implement the abovementioned 
method for image reconstruction. 
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[0012] Also disclosed herein in an exemplary embodiment is a computer data 
signal comprising: the computer data signal comprising code configured to cause a 
processor to implement the abovementioned method for image reconstruction. 

[0013] In yet another exemplary embodiment a system for image 
reconstruction is disclosed. The system comprising: a means for obtaining a 
Helmholtz reciprocal pair of images of an object, the Helmholtz reciprocal pair of 
images comprising a first image and a corresponding reciprocal image; a means for 
determining an imaging geometry associated with the obtaining of said Helmholtz 
reciprocal pair of images; a means for a plurality of points in the first image and 
identifying corresponding candidate points in the corresponding reciprocal image; and 
a means for matching a selected point of the plurality of points and a candidate point 
of said corresponding candidate points. 

[0014] Disclosed herein in another exemplary embodiment is a storage 
medium encoded with a machine-readable computer program code, the code 
including instructions for causing a computer to implement the abovementioned 
method for image registration with an object. 

[0015] Disclosed herein in yet another exemplary embodiment is a computer 
data signal: the computer data signal comprising code configured to cause a processor 
to implement the abovementioned method for image registration with an object. 

[0016] Finally, disclosed herein in another exemplary embodiment is a system 
for image registration with an object comprising: a means for obtaining a Helmholtz 
reciprocal pair of images of an object, said Helmholtz reciprocal pair of images 
comprising a first image and a corresponding reciprocal image; a means for 
estimating a pose for the object; a means for predicting an estimated image 
corresponding to the pose and one image of the reciprocal pair of images; a means for 
comparing the estimated image with a corresponding actual image from the pair of 
images; and a means for refining the estimating a pose based on the comparing. 
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BRIEF DESCRIPTION OF THE DRAWINGS 

[0017] Figure 1 depicts a simplified diagram of a system for capturing and 
reconstructing images in accordance with an exemplary embodiment; 

[0018] Figure 2 depicts a simplified diagram depicting a methodology for 
reconstructing images in accordance with an exemplary embodiment; 

[0019] Figure 3 provides a diagrammatic depiction of a Helmholtz reciprocity 
as applied in an exemplary embodiment; 

[0020] Figure 4 provides a graphical illustration of dynamic programming and 
defining a cost function in accordance with an exemplary embodiment; 

[0021] Figure 5 A provides a diagrammatic depiction of the impact of the 
ordering constraint; 

[0022] Figure 5B provides a diagrammatic depiction of the impact of the 
ordering constraint coupled with the Helmholtz stereopsis in accordance with an 
exemplary embodiment; 

[0023] Figure 6 provides a graphical representation of the simplification 
resulting from Helmholtz stereopsis on occlusions and shadows; 

[0024] Figure 7 depicts a Helmholtz image pair with a specularity and 
mapping along the epipolar lines; 

[0025] Figure 8 is a flowchart depicting a registration methodology in 
accordance with another exemplary embodiment; 

[0026] Figure 9 depicts results with and Crms metric for four different objects 
at different poses; 

[0027] Figure 10 depicts results with and Cms metric for four different objects 
at different poses; 
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[0028] Figure 1 1 depicts results with and C M i metric for four different objects 
at different poses; and 

[0029] Figure 12 depicts a point cloud distribution on several exemplary 
images depicting the optimization of the pose prediction for registration. 

[0030] The present invention will now be described, by way of an example, 
with references to the accompanying drawings, wherein like elements are numbered 
alike in the several figures in which: 

DETAILED DESCRIPTION OF THE INVENTION 

[0031] Referring to Figures 1 and 2, disclosed herein is a method 100 and 
system 10 for three-dimensional reconstruction of surfaces that takes advantage of the 
symmetry resulting from alternating the positions of a receiver 12 e.g., a camera, and 
the like, herein after denoted as camera and a source 14, e.g. light source, lamp and 
the like hereinafter denoted as light source 14. This set up allows for the use of the 
Helmholtz reciprocity principle to recover the shape of and object 30 including 
smooth surfaces with arbitrary bi-directional reflectance distribution functions without 
requiring the presence of texture, as well as for exploiting mutual occlusions between 
images. 

[0032] For a single image pair, the key idea is to approximate the intersection 
of a given epipolar plane and the surface with a piecewise linear curve. This 
formulation provides the local context needed to estimate the components of the 
surface normals that are contained in the epipolar plane so that for a given point on 
the surface, the intensity response in the first image can be predicted from the 
intensity response in the second image. A cost function based on the overall 
prediction error is established, and the optimal approximating polygon is found using 
dynamic programming. In addition, mechanisms for dealing with specularities, image 
saturation regions of high curvature, shadow and occlusions are described. 

[0033] Similar to many traditional dense reconstruction algorithms, the 
methodology utilizes dynamic programming to find an optimal matching along 
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epipolar lines. However, it does not require any unrealistic assumption about the 
BRDF of the scene, such as that it satisfies a Lambertian or Phong model. The 
matching is driven by the predictions of intensity values from one image to the other, 
which are then validated against direct image measurements. Advantageously, over 
the prior art, the methodology disclosed herein successfully employs as few as one 
reciprocal light/camera pair. Moreover, the methodology recovers surface depth and 
orientation simultaneously by determining a global minimum of an error function via 
dynamic programming. Advantageously, since the error is a function not just of depth 
but of surface orientation as well, the image reconstruction is subject to tighter 
geometric constraints than existing techniques, and as a result, fitting to local noise is 
avoided because it would induce a costly global deformation in the reconstruction. 

[0034] In an exemplary embodiment, given a current estimate of surface 
geometry and intensity measurements in one image, Helmholtz reciprocity is used to 
predict the pixel intensity values of the second image of a reciprocal image pair. A 
dynamic program finds the reconstruction that minimizes the total difference between 
the predicted and measured intensity values. This approach allows for the 
reconstruction of surfaces displaying specularities and regions of high curvature, 
which is a challenge commonly encountered in the optical inspection of industrial 
parts. 

[0035] It will be appreciated that Helmholtz reciprocity has been introduced 
into computer vision in the context of dense image reconstruction. Although dense 
reconstruction followed by an ICP algorithm can be implemented to facilitate 
registration, it may readily be observed that given one image of a Helmholtz stereo 
pair and the surface in the correct pose, the second Helmholtz image can be generated 
exactly. Thus, in an exemplary embodiment reconstruction followed by ICP may be 
avoided and replaced by an estimated pose and comparison of a predicted image with 
the actual image and optimizing the estimated pose based on the comparison. 



7 



133843-1 



Image Reconstruction 

[0036] Continuing with the drawings, Figure 2 depicts a simplified block 
diagram implementing a methodology 100 for reconstructing images in accordance 
with an exemplary embodiment. The methodology 100 initiates as depicted at 
process block 110 with acquiring a Helrnholtz reciprocal image pair. The image pair 
may be instantly captured images, or stored images captured at some other time. In 
one exemplary embodiment, a camera 12 and light source 14 are positioned at known 
locations denoted optical centers relative to the object 30 and a first image is captured 
and stored. The camera 12 and light source 14 are then swapped and a second image 
is captured. In an exemplary embodiment a computer 20 and appropriate interfaces 
are utilized to facilitate the image capture, storage, and processing. 

[0037] Continuing with Figure 1 and the methodology 100, at process block 
120, the geometry associated with the image capture is determined and an epipolar 
geometry is established. The geometry is based on the physical location of the 
camera 12 and light source 14 relative to the object 30. In an exemplary embodiment 
the geometry is established by calibrating the position and orientation of the camera 
12 and light source 14 relative to the object 30. In addition an epipolar geometry 
based on the optical centers is computed. Turning now to process block 130, a 
plurality of points on the epipolar line in the first image are selected and a 
corresponding candidate points in the corresponding reciprocal image are identified. 
Corresponding epipolar lines in the corresponding reciprocal images are selected. On 
the epipolar lines, adjacent points in the first image are employed to establish 
matching points on the corresponding epipolar line in the second reciprocal image. 

[0038] Finally, at process block 140 the points are matched along the 
corresponding epipolar lines. To perform the matching, the depths (distance from 
point p to an optical center Ci, C2) and normals for given candidate matches are 
determined. The Helrnholtz reciprocity principle is applied to facilitate prediction of 
intensity for matched points. The predicted intensity for a given point is compared 
with the measured intensity for the same point on the second reciprocal image and 
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minimized via an iterative dynamic programming process to minimize the error in the 
prediction. 

[0039] To facilitate description of the disclosed embodiment, a summary of 
the mathematical background is provided. Referring now to Figure 3 as well, the 
BRDF of a point p on a surface is defined, for a light ray at an incoming direction vi, 
the ratio between the outgoing radiance at a direction V2 and the radiance of the 
incoming light ray, and it is denoted by p(p, Vi, v 2 ). Helmholtz reciprocity implies 
that p(p, Vi, v 2 ) = p (p, v 2 , vi). Consider now a camera 12 and a point light source 14 
arbitrarily positioned. Let Vi be the unit vector pointing from p to the optical center Ci 
of the camera 12, and v 2 the unit vector pointing from p to the location c 2 of the light 
source 14. The radiance I 1>2 received by the camera 12 from p will be given by 
equation (1): 

*ia (P) = *7P(P> v i > v 2 ) n • v 2 7, l ~~^T > (!) 

II C 2-P|| 

where n is the surface normal at p and rj is a scale factor. Similarly, if the positions of 
the camera 12 and the light source 14 are swapped, the new radiance I 2 ,i received by 
the camera 12 will be 

A.i (P) = W(P> v i > v 2 ) n ' v i : — ^T • (2) 

Fi"P| 

Substituting equation (1) into equation (2), given n and the measured intensity Ii, 2 , an 
estimate of the intensity of the corresponding pixel value in the other image may 
readily be computed as 

n- vjc, -p|| 2 

I - (3) 



n v 2 ||c, -p 

[0040] It will be appreciated that equation (3), based on Helmholtz reciprocity 
is independent of the BRDF of the surface. Therefore, from equation (3) it can be 
seen that by acquiring a pair of images in which the positions of the camera 12 and 
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the light source 14 are swapped (a reciprocal pair), the knowledge of the surface 
orientation and depth for a given point allows for any pixel intensity in one image to 
be predicted from the other image regardless of the BRDF of the surface. 

Global Matching via Intensity Prediction 

[0041] Matching algorithms may be employed using dynamic programming to 
reconstruct the intersection of an epipolar plane and a continuous surface, producing a 
global matching of points in an epipolar line in one image against a corresponding 
epipolar line in another image. The basic idea is to create a grid where each column is 
associated with an image point on the epipolar line of the first image and each row is 
associated with an image point of the epipolar line in the second image. Each node in 
the graph represents a point in space that is defined by the intersection of the rays of 
the two image points. The rows and columns are ordered based on position along the 
epipolar lines so that a monotonically increasing path through the grid constitutes a 
valid reconstruction. This approach is known as satisfying an ordering constraint. 
Under the assumption of a Lambertian reflectance model, (e.g., a constant BRDF), the 
cost associated with a step from a node a to a node b is the cost at node a plus a 
normalized correlation error between an intensity window centered at the points 
associated with node b. 

[0042] Referring now to Figure 4 as well, Helmholtz reciprocity can be used 
to define a cost function C(a, b), where a and b are nodes from consecutive columns 
in the dynamic programming grid, that does not depend on the BRDF. The nodes a 
and b define two points in space p a and p b . Points on the line segment between these 
two points are given by 



Assuming smoothness of the surface to be reconstructed, this line segment can be 
used to approximate the surface so that 



p(x) = *P a + 



(1 -x)p b ; 0 <_jc <1 



(4) 




(5) 
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where 

C(x) = (/ 2il (p(x)) - hi (P(x))) 2 , (6) 

is the cost associated with the point p(x), f 2 ,\ is defined in equation 3 and 7 2 ,i is 
directly measured in the image. 

[0043] Observe also that hi depends on the surface normal n, which is not 
available a priori. Let n = + n e , where n e is the projection of the normal vector n 
on the epipolar plane, and n° is the projection of n on the direction orthogonal to the 
epipolar plane. The unit vectors Vi and v 2 defined in the previous section are given by 

y = Ci-PQ) andy = c 2 -p(x) 

II c i-pW| 2 ll c 2-pw ,r 



Since vi and v 2 are both in the epipolar plane, 

Vj -n^Vj-n 6 and v 2 n = v 2 n e (8) 
and n e can be approximated, up to an unknown scale a, as 
fi « (P fl -P^) x ( v i xv 2) 

ii(p a -pj x ( v i xv 2)ir ( 9 ) 

Although it would be preferable to use a higher order approximation, this would 
violate the conditions needed for the applicability of dynamic programming as a tool 
for optimizing the function in (5). Substituting (9) in (8) and then in (3), one obtains 

/^■/..w i/tti - (,o > 

[0044] Everything on the right hand side of equation (10) is either measured 
from capturing the second image, or derived from the camera geometry and the 
surface approximation defined by the points p a and p b . 

[0045] One important observation is that in the traditional approach to dense 
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matching through dynamic programming the only effect previous matches have over 
future ones is by the enforcement of the ordering principle. This constraint limits the 
range of available matches. In the methodology disclosed herein in an exemplary 
embodiment, besides the constraint established from the ordering principle, the prior 
match will reflect on the value of the normal n e , since it depends on both p b and p a . It 
will be appreciated that this results in a stronger coupling of the matches and enforces 
geometric consistency, which places tighter constraints on the reconstruction. As a 
result, the technique presented herein should be less sensitive to local noise than 
existing methodologies. Figures 5 A and 5B provide a diagrammatic depiction of the 
impact of the ordering constraint coupled with the Helmholtz stereopsis in accordance 
with an exemplary embodiment. 

[0046] Figures 5A and 5B illustrate the fundamental difference between 
traditional correlation based matching versus the approach disclosed herein. Two 
cameras 12 with optical centers in ci and C2 observe the surface shown as a solid black 
line. The acceptable region corresponds to the area for which the position of the 
midpoint on the surface does not violate the ordering constraint. If a perturbation is 
forced on this midpoint the rest of the reconstruction for the traditional approach is 
unchanged, as shown by the dashed line in Figure 5A. Advantageously, with the 
methodology introduced herein this is not the case because the change in depth is 
coupled to the estimation of the surface normal, according to equation (10). This 
produces a global change in the rest of the reconstruction, as depicted in Figure 5B, 
and a significant increase in the global cost function defined by equation (6). 
Consequently, the methodology introduced can be more resistant to perturbations 
induced by local noise. 

[0047] Continuing with Figures 4 and 5B, in an exemplary embodiment, each 
column and row of a dynamic programming matrix represents a ray shot from the first 
and second camera 12, respectively. The column and row spacing define the distance 
between consecutive rays as they pierce their corresponding image planes. The 
reconstruction is then generated from the intersection of the row and column rays. 
The surface normals are calculated from line segments connecting points on 
consecutive column rays. In most dense matching applications that use dynamic 
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programming, the row and column spacing are equally sampled. It will be 
appreciated that when the reconstruction is only concerned with depth equal sampling 
may be adequate. However, for an exemplary embodiment to ensure accurate 
approximation for surface normals, the column spacing should be set approximately 
20 times greater than the row spacing. However, it will also be appreciated that the 
column and row spacing requirements have a direct impact on the execution time of 
the algorithm. The complexity of the dynamic program is O (N c N 2 r ) where N c is the 
number of columns and N r is the number of rows. Since N r »2(W C , a direct 
implementation of this methodology would require approximately 400 times the 
processing of a standard dense matching algorithm. Therefore, to address this issue, 
in another exemplary embodiment, a scale space approach has been adopted. 
Multiple iterations of dynamic program are performed. Initially, the row and column 
spacing are set to relatively large values resulting in a coarse reconstruction. 
Subsequent iterations use tighter row and column spacing, however the number of 
possible matches in the next reconstruction is limited to a neighborhood of the match 
obtained in the prior reconstruction such that N r is effectively 10 percent of its full 
value. For each iteration, this reduces the run time by a factor of 100. The process is 
continued until the desired resolution is achieved. 

[0048] Turning to Figure 6, a significant feature of the Helmholtz stereopsis 
and geometry is that any point, which is simultaneously visible and illuminated in one 
image, must also be visible and illuminated in the other image. The interchange of 
the camera 12 and light source 14 locations produces a mutual occlusion effect: an 
occluding contour in one image will correspond to a zone of shadow in the other 
image. With standard stereopsis, shadows and occlusions are independent. However, 
with Helmholtz stereopsis, shadows on one epipolar line in one image directly map as 
an occlusion on the corresponding epipolar line of the corresponding image and vice 
versa. Therefore, the matching of points along epipolar lines may be carried out as a 
dynamic programming problem along matching segments of the epipolar lines lying 
between an occluding contour and the beginning of a shadow, and the end points of 
the segments will already be in correspondence. More particularly, points that are 
classified as not visible, for not meeting an intensity threshold, are deleted from the 
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dynamic programming matrix. Image points are then grouped into contiguous regions 
of visibility. If the image points in either image for a given pair of nodes a and b 
belong to different regions, the cost function C(a f b) is set to zero. In this way the 
continuity constraints are not enforced over regions where they do not apply. 
Advantageously, application of this principle leverages the effects of shadows and 
occlusions to further simplify the processing issues associated with the reconstruction 
and further enhance the robustness of the image reconstruction. 

[0049] Turning now to yet another feature of the disclosed exemplary 
embodiments, it will be appreciated that employing Helmholtz stereopsis, the 
difficulties previously associated with specularities, saturation and blooming may 
readily be mitigated. In particular, specularities have confounded traditional dense 
matching algorithms based on static illumination because the position of the 
specularity shifts depending on camera position. It should be appreciated, that this is 
not the case with Helmholtz reconstruction, because specularities are fixed in the 
surface to be reconstructed, as shown in Figure 7, in fact, facilitating the matching of 
points along the epipolar lines. 

[0050] However, due to limitations in camera dynamic range, it is likely that 
specularities will also produce image saturation (a limitation in the capability of a the 
camera), corrupting the intensity profile along the saturated region, and the cost 
function C(x) defined in equation (6) will be invalid. In this situation, a reasonable 
cost criteria may be re-defined based on the observation that, since both images show 
saturation, the sum S(x) I,(x) 2 + I 2 (x) 2 is as large as possible. Making the 
approximation: 



S(x) ~ J3 



jn«.vJ + M] 



(11) 



where 6 is a constant if one assumes that the point p(x) is approximately 
equidistant from the two cameras, it can be shown that S(x) is maximal when n e 
bisects vi and V2. This result is in agreement with the fact that specular reflections 
should occur only when the angles that the incident and reflected light make with the 
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local surface normal are approximately the same. Thus, if the two projections of p(x) 
are saturated, the cost function C(x) previously defined in equation (6) is set to 



[0051] Many industrial applications require measurements of surfaces with 
extremely high curvature. An example of this is the need to determine the position of 
a point on the leading edge of a fan blade where the radius of curvature is on the order 
of 0.01 inches. A Helmholtz stereopsis leading edge measurement system has been 
implemented and deployed on the factory floor. Comparison against standard 
Coordinate Measurement Machines result in an RMS reconstruction error on the order 
of 0.0012 inches. 

[0052] Consider also the possibility of performing a dense reconstruction on 
high curvature regions. Since the surface normals along an epipolar plane change 
rapidly, the column and row spacing in the dynamic programming grid should 
preferably be maintained much smaller than a pixel. 

[0053] A natural extension to this work is to use a parametric model of the 
surface to be reconstructed. In this case, the normal vectors could be directly 
extracted from the current estimate of the surface, and a global nonlinear optimization 
algorithm would be applied to the surface shape parameters to minimize the error in 
the prediction of the intensity values according to equation (10). Observe that that 
would significantly reduce the number of parameters to be optimized, but, since the 
estimation of the surface shape would have to be carried out by a nonlinear 
optimization technique, it would be necessary that a good initial estimate of the 
surface shape were available. Advantageously, the initial estimate could be provided 
by the non-parametric method previously described. 

Image Registration 

[0054] Disclosed herein in yet another exemplary embodiment, is a 
methodology for image/model registration. Referring now to Figure 8, the 
methodology 200 may be summarized as three primary processes: prediction of the 
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model appearance as depicted at process block 202; comparison of the predicted 
appearance against observed images as depicted at process block 204; and refinement 
of the model pose to optimize the match between the predicted and observed images 
as depicted at process block 206. 

Helmholtz Prediction 

[0055] Referring once again to Figures 1, 2, 3, and 8, given a three- 
dimensional (3D) point cloud model of an object 30, the Helmholtz configuration is 
used to generate a predicted image. Initially, a light source 14 is positioned at c 2 , and 
a camera 12 at Ci captures an image of the object 30. The positions of the camera 12 
and light source 14 are then switched, and a second image is acquired to establish the 
previously mentioned Helmholtz reciprocal image pair as depicted at optional process 
block 201. 

[0056] Next, the pose of the object model is estimated as depicted at process 
block 202. For a given model point p, the distances from the point to the camera 
centers Ci and c 2 are computed. Additionally, the surface normal n at point p is 
determined. Advantageously, it should be appreciated that it is also possible, and 
computationally more efficient, to pre-compute the surface normals n at each point p. 
The viewing directions vi and v 2 associated to the camera centers Ci and c 2 are 
computed. A ray is then projected from Ci to p. The intensity of the pixel through 
which the ray passes is recorded as Ii >2 . Finally, utilizing equation (3), /z/.is 
computed. This is the predicted intensity of the same data point as seen by the camera 
at c 2 . This procedure is repeated for each data point in the model, generating a 
prediction of the image as seen from c 2 . It is beneficial to observe that this prediction 
is in agreement with a complete modeling of the surface's BRDF, without requiring 
the BRDF to be explicitly measured. 

Image Comparison and Metrics 

[0057] Continuing now with Figure 8, at process block 204, the comparison of 
the predicted appearance against observed images may be based on one or more 
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metrics. In an exemplary embodiment, several image comparison metrics as they 
relate to the Helmholtz generative paradigm are considered. It will readily be 
appreciated that the choice of an appropriate metric for image comparison may exert 
significant influence over registration convergence and accuracy. The most direct 
way to measure image dissimilarity is by the root mean square of pixel differences, 
Crms, given by: 



where N is the number of pixels. Figure 9 shows the value of this metric for four 
different objects at different poses, each of which has unique geometrical and textural 
properties. Position zero indicates perfect registration. 

[0058] Let t = [00 0] T in centimeters and = [0 0 0] T in degrees be vectors 
representing the correct pose of the objects. The plots in Figure 9 show results of 
equation (13) when the pose of the object is perturbed in arbitrary translational and 
rotational directions, denoted by sAt with || At || = 8 cm and sA0 with || A 8 || = 14°, 
respectively, for different values of the parameter s, which measures how big the 
perturbed pose deviates from the optimal one. This corresponds to a one-dimensional 
slice of the full six-dimensional SE(3) manifold in which the pose parameters lie, and, 
therefore, cannot offer a full picture of the optimization landscape. However, it 
should be noted that on this slice at least the correct pose corresponds to the minimum 
of equation (13). 

[0059] It is clear that any gradient-based optimization algorithm would have 
difficulties in converging to the true solution in the case of the fish data set. The fish 
data set is from a highly textured surface, with a background with the same material 
and colors as the foreground. This problem calls for a different type of similarity 
measure, such as the median of the square of the pixel differences, which should 
produce a metric G M s more robust to image outliers 




(13) 



€ M s = median((/ 2 (^) - l2{x,y))\ y ) 



(14) 
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[0060] Figure 10 shows values of G M s for the same four objects as in Figure 9. 
Note that the cost curve for the blade has a minimum, which is displaced from the 
optimal alignment position. On the other hand, the Grms cost curve of the same 
object is quite smooth, and has a minimum very close to the position of optimal 
alignment. The model is perturbed from the optimal positioning the same way as 
used to produce the results in Figure 9. It will be appreciated that, the shape of Cms 
for this object may be because the blade is also almost textureless. These results 
suggest that registration should be performed using Grms when dealing with objects 
characterized by smooth, textureless surfaces, and Cms is preferred for when highly 
textured surfaces are concerned. 

[0061] Yet another metric may be employed that would depend on the spatial 
distribution of image intensities is the mutual information Gmi> expressed as 

G mi = H [J 2 ] + H [I 2 ] - H[/ 2 , 1 2 ] (1 5) 

where Hfx] is the entropy of the random variable x. 

[0062] Figure 1 1 shows the mutual information between a predicted and 
actual image using both the Helmholtz generative approach and the Lambertian 
approximation scheme. In all cases, the methodology disclosed herein provides more 
information than a Lambertian model. Additionally, the Lambertian scheme fails to 
identify the correct model pose for some instances e.g., the fish and the doll's head, 
whereas the Helmholtz method succeeds in doing so for all four objects. 

[0063] It will readily be appreciated that the choice of an appropriate metric 
for image comparison may exert significant influence over registration convergence 
and accuracy. The methods and results provided herein should be understood to be 
exemplary only to illustrate the effect of a selected metric. While the results provided 
suggest that registration should be performed using Grms when dealing with objects 
characterized by smooth, textureless surfaces, and G M s is preferred for when highly 
textured surfaces are concerned, other metrics are possible. In fact, numerous 
methodologies may be employed for optimization of the estimate of the pose 
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including, but not limited to, a gradient descent, Monte Carlo, an exhaustive search, 
and the like, as well as combinations including at least one of the foregoing. 

Pose Optimization 

[0064] Once a prediction of the model appearance is compared against an 
actual image, the difference between the two may be employed to drive an 
optimization algorithm to refine the pose of the model as depicted at process block 
206 of Figure 8. This can be carried out by optimizing any of the cost functions such 
as those illustrated in equations (13) or (15), where I 2 (x, y), for all pixel coordinates 
(x, y), is a function of the same parameters R and t, corresponding to a rotation matrix 
and a translation vector with respect to the initial pose of the model. The dependency 
of I 2 (x, y) on the orientation R and location t of the model is made explicit in equation 
(3), since p = p (R, t), n = n(R), and (x, y) are the coordinates of the projection of the 
point p, i.e., (x,y) = x = x(p). Therefore, the pose (R,t) of the model can be obtained 
as 

0R,f) = argmax J (I 2 (x(p(R,t)))- I 2 x(p)(R,t)) 2 (16) 

(R>t) p 

or 

(R,t) = arg max median {(L(jc(/;(if,0))- I 2 x(p)(R,t))) 2 }. (17) 

(*>o p 

[0065] In the examples shown in this work the optimization method adopted 
to solve (16) or (17) was conjugate gradient, with derivatives computed via finite 
differences, although many other options are possible. 

[0066] To begin the optimization process it is necessary to have an initial 
estimation of the pose that is close enough to the true position so that the optimization 
algorithm will converge. For the registration of industrial parts it is usually the case 
that a good initial guess is readily available. For tracking applications, it is customary 
to postpone the initialization problem, and at every iteration, the current estimation of 
the pose provides an initial guess for the next iteration that should be close to the 
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ground truth. 

[0067] Because the Helmholtz reciprocity principle yields an exact generative 
model, there should be zero difference between the predicted and observed images 
given perfect alignment. Generally, there will be a discrepancy between the predicted 
image and the actual image as seen by the camera 12 at c 2 . This is a result of model 
misalignment; which can be quantified using RMS (the root of mean squared 
differences), LMS (the median of squared differences), or MI (mutual information). 
With a properly chosen metric, conjugate gradient is used to update the model's 
transformation matrix. After the model is re-positioned in the scene, another 
predicted image is generated, and the cost of the current orientation is again 
computed. This series of steps is repeated until convergence is reached. 

[0068] In order to validate the technique introduced here, a series of 
experiments was performed. A Helmholtz stereo pair was established by placing 
point light sources 14 as close as possible to the optical center of two identical 
cameras 12, but avoiding the lights from being occluded by the cameras themselves. 
Three images were acquired for four objects, one with the lights off, to measure 
ambient light; and two images for transposed lights 14 and cameras 12. The 
background image was then subtracted from each image in the Helmholtz pair to 
eliminate ambient light contributions. A 3D model for each object was obtained by 
sweeping the object with a laser stripe and performing stereo reconstruction. The 3D 
points of the model were then perturbed from they original position by a translation of 
2.0 cm in each of the jc, y and z directions, and by a rotation of 10° (degrees) around 
each of the jc, y and z axes. It should be noted that this corresponds to a total 
translation of 6.9 cm and a rotation of 17.3°. Since the cameras 12 used in the 3D 
model reconstruction were the same used for the registration, without any change in 
position or parameters, optimal alignment is obtained with zero translation and 
rotation. The matrix R was represented through an exponential map, e.g., R = 
exp([n>] x ), where [w] x is the anti-symmetric matrix built from the entries of w such 
that [h>] x jc = w x jc for all values of jc. The direction of w is the axis around which the 
rotation is performed and the magnitude of w is the rotation angle. The algorithm 
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described herein in accordance with an exemplary embodiment was employed, and 
good alignment (final translation of 2 mm and final rotation of 1°) using the median of 
the difference of pixel intensity as a metric was achieved for all data sets except for 
the fish, which had to be initialized with translations of 1.0 cm in the x y and z 
directions, as well as rotation of 3° around the x, y and z axes. This corresponds to a 
total translation of 1.7 cm and a rotation of 5.20. As a quick experiment to verify the 
robustness of the registration to the initial pose of the model, the initial translations 
and rotations applied to the models were multiplied by -1, and the registration 
algorithm was rerun. Again, convergence within 2 mm and 1° was obtained. Figure 
11 shows initial and final pose for each object. The difficulty in convergence for the 
fish model can be attributed to the cluttered background, which has the same texture 
as the fish itself, and to small size of this model. 

[0069] The disclosed exemplary embodiments introduce a technique for 
registration of 3D models to 2D images based on Helmholtz reciprocity. By 
exploiting this principle the methodology facilitates prediction of the appearance of 
the back projected model in agreement with its BRDF without that having to 
explicitly know the BRDF. This is a great advantage over techniques, which assume 
a Lambertian model, valid only for certain types of surfaces. In particular, such 
algorithms are not capable of handling shinny of specular surfaces. After the 
appearance of the model has been predicted, a suitable image metric is used to 
quantify the discrepancy between predicted and observed images. Since the predicted 
image should be in agreement with the BRDF of the object, this discrepancy can be 
attributed to misalignment of the object 30, and it can therefore drive a search for 
optimal registration parameters. The effectiveness of this algorithm was 
demonstrated in a number of registration experiments with different objects, as well 
as by comparison with mutual information. 

[0070] The disclosed invention can be embodied in the form of computer or 
controller 20 implemented processes and apparatuses for practicing those processes. 
The present invention can also be embodied in the form of computer program code 
containing instructions embodied in tangible media 16, such as floppy diskettes, CD- 
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ROMs, hard drives, or any other computer-readable storage medium, wherein, when 
the computer program code is loaded into and executed by a computer or controller 
20, the computer 20 becomes an apparatus for practicing the invention. The present 
invention can also be embodied in the form of computer program code embodied as a 
data signal 18, for example, whether stored in a storage mediuml6, loaded into and/or 
executed by a computer or controller, or transmitted over some transmission medium, 
such as over electrical wiring or cabling, through fiber optics, or via electromagnetic 
radiation, wherein, when the computer program code is loaded into and executed by a 
computer, the computer becomes an apparatus for practicing the invention. When 
implemented on a general-purpose microprocessor, the computer program code 
segments configure the microprocessor to create specific logic circuits. 

[0071] It will be appreciated that the use of first and second or other similar 
nomenclature for denoting similar items is not intended to specify or imply any 
particular order unless otherwise stated. 

[0072] While the invention has been described with reference to an exemplary 
embodiment, it will be understood by those skilled in the art that various changes may 
be made and equivalents may be substituted for elements thereof without departing 
from the scope of the invention. In addition, many modifications may be made to 
adapt a particular situation or material to the teachings of the invention without 
departing from the essential scope thereof. Therefore, it is intended that the invention 
not be limited to the particular embodiment disclosed as the best mode contemplated 
for carrying out this invention, but that the invention will include all embodiments 
falling within the scope of the appended claims. 
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